EN FR
EN FR


Project Team Grand-large


Bibliography


Project Team Grand-large


Bibliography


Section: New Results

MIcrowave Data Analysis for petaScale computers

Participants : Laura Grigori, Mikolaj Szydlarski, Meisam Shariffy.

In [47] we describe an scalable algorithm for computing an inverse spherical harmonic transform suitable for cluster of multiple CPU-GPUs. We base our implementation on hybrid programming combining MPI and CUDA. We focus our attention on the two major sequential steps involved in the transforms computation, retaining the efficient parallel framework of the original code. We detail optimization techniques used to enhance the performance of the OpenMP/CUDA-based code and compare them with those implemented in the public domain parallel package, S2HAT.

We also present performance comparisons of the multi GPU version and a hybrid, MPI/OpenMP version of the same transform. We find that one NVIDIA Tesla S1070 can accelerate overall execution time of the SHT by as much as 3 times with respect to the MPI/OpenMP version executed on one quad-core processor (Intel Nehalem 2.93 GHz) and, owing to very good scalability of both versions, 128 Tesla cards perform as good as 256 twelve-core processor (AMD Opteron 2.1 GHz).

The work presented here has been performed in the context of the Cosmic Microwave Background simulations and analysis. However, we expect that the developed software will be of more general interest and applicability.